Lesson 7: Transfer Learning - Leveraging Knowledge

Welcome to Lesson 7, which introduces Transfer Learning. This technique takes a deep learning model that has already been trained on a massive, general dataset (such as ImageNet) and adapts it to a new, specific task (such as our FoodVision challenge). It is essential for reaching state-of-the-art results efficiently, especially when labeled data is limited.

1. The Power of Pre-trained Weights

Deep neural networks learn features hierarchically. The early layers learn fundamental concepts (edges, corners, textures), while the deeper layers combine these features into complex concepts (eyes, wheels, specific objects). The key point is that the fundamental features learned early on are broadly applicable across most visual domains.

Components of Transfer Learning

Source Task: Training on a large, general dataset (for example, ImageNet: roughly 14 million images in the full dataset, with the commonly used ImageNet-1k subset covering 1000 categories).
Target Task: Adapting the weights to classify a much smaller dataset (for example, our specific FoodVision classes).
Reused Component: The vast majority of the network's parameters (the feature-extraction layers) are reused directly.

Resource Savings

Transfer learning sharply lowers two major resource barriers: Computational Cost (you avoid training the whole model for days) and Data Requirements (high accuracy can often be reached with hundreds, rather than thousands, of training examples).

Question 1

What is the primary advantage of using a model pre-trained on ImageNet for a new vision task?

It requires less labeled data than training from scratch.

It completely eliminates the need for any training data.

It guarantees 100% accuracy immediately.

Question 2

In a Transfer Learning workflow, which part of the neural network is typically frozen?

The final Output Layer (Classifier Head).

The Convolutional Base (Feature Extractor layers).

The entire network is usually unfrozen.

Question 3

When replacing the classifier head in PyTorch, what parameter must you first determine from the frozen base?

The batch size of the target data.

The input feature size (the length of the feature vector that the frozen base feeds into the head).

The total number of model parameters.

Challenge: Adapting the Classifier Head

Designing a new classifier for FoodVision.

You load a ResNet model pre-trained on ImageNet. Its last feature layer outputs a vector of size 512. Your 'FoodVision' project has 7 distinct food classes.

Step 1

What is the required Input Feature size for the new, trainable Linear Layer?

Solution:
The Input Feature size must match the output of the frozen base layer.
Size: 512.

Step 2

What is the required Output Feature size for the new Linear Layer?

Solution:
The Output Feature size must match the number of target classes.
Size: 7.

Step 3

What is the PyTorch code snippet to create this new classification layer (assuming it is assigned to `new_layer`)?

Solution:
The frozen base's output size (512) is the layer's input size, and the class count (7) is its output size.
Code: new_layer = torch.nn.Linear(512, 7)